Co-authored-by: David Bayer <48736217+davebayer@users.noreply.github.com>
…a-cuda Linker to link LTO (NVIDIA#7011) Co-authored-by: Ashwin Srinath <shwina@users.noreply.github.com>
This allows us to use it independently
…NVIDIA#7024) Co-authored-by: pciolkosz <pciolkosz@nvidia.com>
* Rework hierarchy levels
* add missing launches to native cluster level queries
* remove dependency on runtime storage
---------
Co-authored-by: pciolkosz <pciolkosz@nvidia.com>

* Remove _view from the shared memory getter
* Forgot about cudax

* Ignore CUDA free errors in thrust memory resource
* Add a comment

* Don't set current device in CUDA 13 and handle extended lambda
* Add extended lambda test
* Compiler workarounds
* Waive extended lambda test on NVRTC
* Apply suggestion from @davebayer
---------
Co-authored-by: David Bayer <48736217+davebayer@users.noreply.github.com>
…regardless of exception support (NVIDIA#7028) Co-authored-by: David Bayer <48736217+davebayer@users.noreply.github.com>
@leofang wrote:
A "custom layout mapping" is a user-defined type that meets the layout mapping requirements. This file in the reference mdspan implementation's tests has examples. It should be possible for us to write a custom layout mapping that supports arbitrary DLPack layouts. It would need to store an offset as well as strides, so that negative strides would still result in a nonnegative mapping result.
The following paper explains these issues: https://isocpp.org/files/papers/P3959R0.html .
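To make the offset requirement concrete, here is a minimal sketch (not taken from the PR or from the paper) of how an offset could be chosen so that a mapping with negative strides never yields a negative result. The helper names `min_offset` and `map_index` are hypothetical, and plain `std::array`s stand in for an extents type.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the smallest offset that keeps every mapped value
// nonnegative. A negative stride's term is most negative at the largest
// index, (extent - 1), so it contributes (extent - 1) * |stride|.
template <std::size_t Rank>
constexpr std::intptr_t min_offset(
    const std::array<std::intptr_t, Rank>& extents,
    const std::array<std::intptr_t, Rank>& strides) {
  std::intptr_t offset = 0;
  for (std::size_t r = 0; r < Rank; ++r) {
    if (strides[r] < 0 && extents[r] > 0) {
      offset += (extents[r] - 1) * (-strides[r]);
    }
  }
  return offset;
}

// Hypothetical helper: the mapping itself, i.e. the offset plus the dot
// product of indices and strides.
template <std::size_t Rank>
constexpr std::intptr_t map_index(
    std::intptr_t offset,
    const std::array<std::intptr_t, Rank>& strides,
    const std::array<std::intptr_t, Rank>& indices) {
  std::intptr_t result = offset;
  for (std::size_t r = 0; r < Rank; ++r) {
    result += indices[r] * strides[r];
  }
  return result;
}
```

For extents (4, 3) with strides (-3, 1), this sketch picks an offset of 9, so the index (3, 0) maps to 0 instead of -9.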
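Here is a rough draft of a custom layout mapping that would support DLPack's layout (including zero and negative strides): https://godbolt.org/z/WEYazcsxT . I've commented out
// Standard headers used below. The mdspan header itself must be included
// as well (the draft targets the single-header reference implementation).
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <span>
#include <tuple>
#include <type_traits>
#include <utility>
namespace std {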
// work around issue in single header reference implementation
using ::std::experimental::dims;
namespace impl {
// [mdspan.layout.stride.expo] 2 defines OFFSET(m).
// It's only ever applied to strided layout mappings.
template<class Mapping>
requires(Mapping::is_always_strided())
constexpr auto offset(const Mapping& m) {
if constexpr (Mapping::extents_type::rank() == 0) {
return m();
}
else {
using index_type = typename Mapping::index_type;
constexpr auto rank = Mapping::extents_type::rank();
bool any_zero = false;
for (::std::size_t r = 0; r < rank; ++r) {
any_zero = any_zero || (m.extents().extent(r) == 0);
}
if (any_zero) {
return index_type(0);
}
else {
constexpr auto zeros =
[]< ::std::size_t... Rs>(std::index_sequence<Rs...>) {
return std::tuple{((void) Rs, index_type(0))...};
} (std::make_index_sequence<rank>());
return std::apply(m, zeros);
}
}
}
}
class layout_stride_relaxed {
public:
template<class Extents>
class mapping;
};
template<class Extents>
class layout_stride_relaxed::mapping {
public:
using extents_type = Extents;
using index_type = extents_type::index_type;
using size_type = extents_type::size_type;
using rank_type = extents_type::rank_type;
using layout_type = layout_stride_relaxed;
private:
static constexpr rank_type rank_ = extents_type::rank();
public:
constexpr mapping() noexcept {
const layout_right::mapping<extents_type> map{};
for (std::size_t d = 0; d < rank_; ++d) {
strides_[d] = map.stride(d);
}
}
constexpr mapping(const mapping&) noexcept = default;
template<class OtherIndexType>
requires(
::std::is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
::std::is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
)
constexpr mapping(
const extents_type& e,
::std::span<OtherIndexType, rank_> s,
::std::size_t offset = 0) noexcept
: extents_(e), offset_(offset)
{
for (::std::size_t d = 0; d < rank_; ++d) {
strides_[d] = s[d];
}
}
template<class OtherIndexType>
requires(
is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
)
constexpr mapping(
const extents_type& e,
const ::std::array<OtherIndexType, rank_>& s,
::std::size_t offset = 0) noexcept
: extents_(e), offset_(offset)
{
for (::std::size_t d = 0; d < rank_; ++d) {
strides_[d] = s[d];
}
}
// m IS a layout_stride_relaxed::mapping
template<class OtherMapping>
requires(
detail::layout_mapping_alike<OtherMapping> &&
::std::is_constructible_v<
extents_type,
typename OtherMapping::extents_type
> &&
::std::is_same_v<
layout_type,
typename OtherMapping::layout_type
>
)
constexpr explicit(
! (
::std::is_convertible_v<
typename OtherMapping::extents_type, extents_type
>
)
)
mapping(const OtherMapping& m) noexcept
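// Use the public observer: m may be a different specialization,
// whose private members are not accessible here.
: extents_(m.extents()), offset_(m.offset())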
{
for (std::size_t d = 0; d < rank_; ++d) {
strides_[d] = m.stride(d);
}
}
// m is NOT a layout_stride_relaxed::mapping
template<class StridedLayoutMapping>
requires(
detail::layout_mapping_alike<StridedLayoutMapping> &&
::std::is_constructible_v<
extents_type,
typename StridedLayoutMapping::extents_type> &&
StridedLayoutMapping::is_always_unique() &&
StridedLayoutMapping::is_always_strided()
)
constexpr explicit(
! (
::std::is_convertible_v<
typename StridedLayoutMapping::extents_type, extents_type
> && (
detail::is_mapping_of<layout_left, StridedLayoutMapping> ||
detail::is_mapping_of<layout_right, StridedLayoutMapping> ||
experimental::detail::is_layout_left_padded_mapping<
StridedLayoutMapping>::value ||
experimental::detail::is_layout_right_padded_mapping<
StridedLayoutMapping>::value ||
detail::is_mapping_of<layout_stride, StridedLayoutMapping>
)
)
)
mapping(const StridedLayoutMapping& m) noexcept
: extents_(m.extents())
{
for (std::size_t d = 0; d < rank_; ++d) {
strides_[d] = m.stride(d);
}
}
constexpr mapping& operator=(const mapping&) noexcept = default;
// [mdspan.layout.stride.obs], observers
constexpr const extents_type& extents() const noexcept {
return extents_;
}
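constexpr ::std::array<index_type, rank_> strides() const noexcept {
// strides_ is stored as intptr_t, so convert element-wise.
// As with stride(), a negative stride does not survive conversion
// to an unsigned index_type.
::std::array<index_type, rank_> result{};
for (::std::size_t d = 0; d < rank_; ++d) {
result[d] = static_cast<index_type>(strides_[d]);
}
return result;
}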
constexpr ::std::intptr_t offset() const noexcept {
return offset_;
}
constexpr index_type required_span_size() const noexcept {
// The dot product of indices and strides is linear.
// Thus, over all valid indices, the max value of the
// dot product is achieved at the extrema: either the
// min index (0) if the stride is negative, or the max
// index (extent(r) - 1) if the stride is nonnegative.
std::array<index_type, rank_> max_indices{};
for (std::size_t r = 0; r < rank_; ++r) {
const index_type ext = extents_.extent(r);
if (ext == 0) {
// An empty index space requires no elements at all.
return index_type(0);
}
max_indices[r] = strides_[r] < 0 ? index_type(0) : ext - index_type(1);
}
index_type dot = 0;
for (std::size_t r = 0; r < rank_; ++r) {
dot += max_indices[r] * strides_[r];
}
// The required span size is one past the largest mapped value.
return offset() + dot + index_type(1);
}
template<class... Indices>
requires(
sizeof...(Indices) == rank_ &&
(::std::is_convertible_v<Indices, index_type> && ...) &&
(::std::is_nothrow_constructible_v<index_type, Indices> && ...)
)
constexpr index_type operator()(Indices... inds) const noexcept {
return offset() +
[&, this]<::std::size_t... Rs>(::std::index_sequence<Rs...>) {
return ((inds...[Rs] * strides_[Rs]) + ... + index_type(0));
} (::std::make_index_sequence<rank_>());
}
static constexpr bool is_always_unique() noexcept { return false; }
static constexpr bool is_always_exhaustive() noexcept { return false; }
// It's technically NOT always strided, because of the offset
// (to accommodate negative strides)
static constexpr bool is_always_strided() noexcept { return false; }
constexpr bool is_unique() const noexcept {
// The Standard doesn't require that this be exact.
// Possibility of negative strides with an offset
// makes that harder to figure out.
return false;
}
constexpr bool is_exhaustive() const noexcept {
// The Standard doesn't require that this be exact.
// Possibility of negative strides with an offset
// makes that harder to figure out.
return false;
}
constexpr bool is_strided() const noexcept {
return offset_ == 0;
}
constexpr index_type stride(rank_type i) const noexcept {
return strides_[i];
}
// y is also a layout_stride_relaxed::mapping
template<class OtherMapping>
requires(
detail::layout_mapping_alike<OtherMapping> &&
rank_ == OtherMapping::extents_type::rank() &&
::std::is_same_v<layout_type, typename OtherMapping::layout_type>
)
friend constexpr bool
operator==(const mapping& x, const OtherMapping& y) noexcept {
return x.extents() == y.extents() &&
x.offset() == y.offset() &&
[&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
return ((x.stride(Rs) == y.stride(Rs)) && ...);
} (::std::make_index_sequence<rank_>());
}
// y is NOT a layout_stride_relaxed::mapping but is strided.
template<class OtherMapping>
requires(
detail::layout_mapping_alike<OtherMapping> &&
rank_ == OtherMapping::extents_type::rank() &&
OtherMapping::is_always_strided()
)
friend constexpr bool
operator==(const mapping& x, const OtherMapping& y) noexcept {
return x.extents() == y.extents() &&
impl::offset(y) == x.offset_ &&
[&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
return ((x.stride(Rs) == y.stride(Rs)) && ...);
} (::std::make_index_sequence<rank_>());
}
private:
extents_type extents_{};
std::intptr_t offset_ = 0;
array<std::intptr_t, rank_> strides_{};
#if 0
// [mdspan.sub.map], submdspan mapping specialization
template<class... SliceSpecifiers>
constexpr auto submdspan-mapping-impl(SliceSpecifiers...) const
-> /* see-below */;
template<class... SliceSpecifiers>
friend constexpr auto submdspan_mapping(
const mapping& src, SliceSpecifiers... slices) {
return src.submdspan-mapping-impl(slices...);
}
#endif // 0
};
} // namespace std
int main() {
std::dims<3> exts(3, 5, 11);
std::array<std::intptr_t, 3> strides{0, 1, 5}; // broadcasting
std::layout_stride_relaxed::mapping<std::dims<3>> map(exts, strides);
assert(map(0, 1, 1) == map(1, 1, 1));
return 0;
}
Co-authored-by: David Bayer <48736217+davebayer@users.noreply.github.com>
🥳 CI Workflow Results
🟩 Finished in 1h 41m: Pass: 100%/84 | Total: 1d 03h | Max: 1h 40m | Hits: 97%/199140
See results here.
Description
This PR implements conversion utilities that take a DLTensor view and produce a (host/device/managed) mdspan over the same underlying memory.
The opposite conversion (mdspan to DLPack) is implemented in #7027, which is also a prerequisite of this PR.
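As an illustration only, one step such a conversion plausibly needs is turning the tensor's layout fields into explicit strides and a starting address. The sketch below is not the PR's implementation. `TensorDesc` is a hypothetical stand-in that mirrors only the layout-related DLTensor fields (`data`, `ndim`, `shape`, `strides`, `byte_offset`), and it assumes a null `strides` pointer means compact row-major storage, which not every DLPack version permits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the layout-related fields of a DLTensor.
// This is NOT the real DLPack struct; use dlpack.h for that.
struct TensorDesc {
  void* data;
  std::int32_t ndim;
  const std::int64_t* shape;
  const std::int64_t* strides;  // in elements; null is assumed to mean compact row-major
  std::uint64_t byte_offset;
};

// Return explicit strides (in elements), synthesizing row-major strides
// when the descriptor carries none.
inline std::vector<std::int64_t> resolve_strides(const TensorDesc& t) {
  std::vector<std::int64_t> result(static_cast<std::size_t>(t.ndim));
  if (t.strides != nullptr) {
    result.assign(t.strides, t.strides + t.ndim);
    return result;
  }
  std::int64_t running = 1;
  for (std::int32_t r = t.ndim - 1; r >= 0; --r) {
    result[static_cast<std::size_t>(r)] = running;
    running *= t.shape[r];
  }
  return result;
}

// Address of the first element: the base pointer advanced by byte_offset.
inline void* first_element(const TensorDesc& t) {
  return static_cast<char*>(t.data) + t.byte_offset;
}
```

For a shape of (3, 5, 11) with no strides given, `resolve_strides` yields (55, 11, 1). Explicit strides such as (0, 1, 5) pass through unchanged, which is where a relaxed layout mapping would be needed on the mdspan side.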
Todo: